Characterizing polypeptides

ABSTRACT

A method for characterizing polypeptides, which comprises: (a) treating a sample comprising a population of one or more polypeptides with a cleavage agent which is known to recognize a specific amino acid residue or sequence in polypeptide chains and to cleave at a cleavage site, whereby the population is cleaved to generate peptide fragments; (b) isolating a population of the peptide fragments which bear at one end a reference terminus comprising either only a C-terminus or only an N-terminus and which bear at the other end the cleavage site proximal to the reference terminus; and (c) determining a signature sequence of at least some of the isolated fragments, which signature sequence is the sequence of a predetermined number of amino acid residues running from the cleavage site; wherein the signature sequence and the relative position of the cleavage site to the reference terminus characterize the polypeptide or each polypeptide.

FIELD OF THE INVENTION

The present invention relates to a method for characterizing polypeptides and to methods for identifying and assaying such polypeptides.

BACKGROUND TO THE INVENTION

The characterisation and identification of polypeptides from complex mixtures thereof, such as protein samples found in biological systems, is a well-known problem in biochemistry. Traditional methods involve a variety of liquid phase fractionation and chromatography steps followed by characterization, for example by two dimensional gel electrophoresis. Such methods are prone to artefacts and are inherently slow. Moreover, automation of such methods is extremely difficult.

Patent Application PCT/GB97/02403, filed on Sep. 5, 1997, describes a method for profiling a cDNA population in order to generate a `signature` for every cDNA in the population. It is assumed in that method that a short sequence of about 8 bp that is determined with respect to a fixed reference point is sufficient to identify almost all genes. This system relies on immobilizing the cDNA population at the 3' terminus and cleaving it with a restriction endonuclease. This leaves a population of 3' restriction fragments. The patent describes a technique that allows one to determine a signature of roughly 8 to 10 base pairs at a specified number of bases from the restriction site, which is a sufficient signature to identify nearly all genes.

Techniques for profiling proteins, that is to say cataloguing the identities and quantities of proteins in a tissue, are less well developed in terms of automation or high throughput. The classical method of profiling a population of proteins is by two-dimensional electrophoresis. In this method a protein sample extracted from a biological sample is separated on a narrow gel strip. This first separation usually separates proteins on the basis of their iso-electric point. The entire gel strip is then laid against one edge of a rectangular gel. The separated proteins in the strip are then electrophoretically separated in the second gel on the basis of their size. This technology is slow and very difficult to automate. It is also relatively insensitive in its simplest incarnations. A number of improvements have been made to increase resolution of proteins by 2-D gel electrophoresis and to improve the sensitivity of the system. One method to improve the sensitivity of 2-D gel electrophoresis and its resolution is to analyse the protein in specific spots on the gel by mass spectrometry. One such method is in-gel tryptic digestion followed by analysis of the tryptic fragments by mass spectrometry to generate a peptide mass fingerprint. If sequence information is required, tandem mass spectrometry analysis can be performed.

More recently, attempts have been made to exploit mass spectrometry to analyze whole proteins that have been fractionated by liquid chromatography or capillary electrophoresis. In-line systems exploiting capillary electrophoresis mass spectrometry have been tested. The analysis of whole proteins by mass spectrometry, however, suffers from a number of difficulties. The first difficulty is the analysis of the complex mass spectra resulting from multiple ionisation states accessible by individual proteins. The second major disadvantage is that the mass resolution of mass spectrometers is at present quite poor for high molecular weight species, i.e. for ions that are greater than about 4 kilodaltons in mass, so resolving proteins that are close in mass is difficult. A third disadvantage is that further analysis of whole proteins by tandem mass spectrometry is difficult as the fragmentation patterns for whole proteins are extremely complex.

SUMMARY OF THE INVENTION

The present invention provides a method for characterising polypeptides, which comprises:

(a) treating a sample comprising a population of one or more polypeptides with a cleavage agent which is known to recognise in polypeptide chains a specific amino acid residue or sequence and to cleave at a cleavage site, whereby the population is cleaved to generate peptide fragments;

(b) isolating a population of the peptide fragments which bear at one end a reference terminus comprising either only a C-terminus or only an N-terminus and which bear at the other end the cleavage site proximal to the reference terminus; and

(c) determining a signature sequence of at least some of the isolated fragments, which signature sequence is the sequence of a predetermined number of amino acid residues running from the cleavage site;

wherein the signature sequence and the relative position of the cleavage site to the reference terminus characterise the or each polypeptide.
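Steps (a) to (c) can be illustrated in silico. The following is a minimal sketch only, assuming a simplified trypsin-like rule (cleavage on the C-terminal side of every lysine or arginine) and an idealised, error-free readout; the function names `cleave_after` and `characterise` and the example sequence are hypothetical and are not taken from the specification.

```python
def cleave_after(sequence, residues="KR"):
    """Step (a): split a sequence after every occurrence of a recognised residue."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # remainder after the last cleavage site
    return fragments


def characterise(sequence, signature_length=4, reference="C"):
    """Steps (b) and (c): keep the fragment bearing the reference terminus and
    read a fixed number of residues running from its cleavage site.

    Returns (signature, distance of the cleavage site from the reference
    terminus in residues), or None if the sequence contains no cleavage site.
    """
    fragments = cleave_after(sequence)
    if len(fragments) < 2:
        return None  # nothing was cleaved, so there is no cleavage site
    if reference == "C":
        fragment = fragments[-1]                  # C-terminal fragment
        signature = fragment[:signature_length]   # read from the cleavage site
    else:
        fragment = fragments[0]                   # N-terminal fragment
        signature = fragment[-signature_length:]  # residues adjoining the site
    return signature, len(fragment)


print(characterise("MSTKAGLRDEVFY"))  # ('DEVF', 5)
```

The pair returned for each polypeptide corresponds to the signature sequence and the relative position of the cleavage site recited above; the N-terminal signature is written here in the usual N-to-C order, which is a presentational choice of this sketch.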

The invention therefore describes a system analogous to that of PCT/GB97/02403, but for use with proteins. Since there are 20 monomers that make up a protein, there are a great many more possible variants at a particular site in a sequence, and so the length of signature required from a protein sequence is much shorter than that required from a cDNA sequence to identify it uniquely.
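The effect of alphabet size on signature length can be estimated with a simple counting argument. The sketch below is illustrative only: it assumes that residues occur uniformly and independently, which real sequences do not, so the result is a lower bound rather than a working length. The population size of 15000 is the rough figure for expressed human genes quoted elsewhere in this document, and the function name is hypothetical.

```python
def min_signature_length(n_sequences, alphabet_size):
    """Smallest length L such that alphabet_size ** L >= n_sequences."""
    length = 1
    while alphabet_size ** length < n_sequences:
        length += 1
    return length


# 20 amino acids versus 4 nucleotides, for roughly 15000 distinct species
print(min_signature_length(15000, 20))  # 4, since 20**3 = 8000 and 20**4 = 160000
print(min_signature_length(15000, 4))   # 7, since 4**6 = 4096 and 4**7 = 16384
```

In practice longer signatures are needed than this bound suggests, because sequences are neither uniform nor independent; the comparison only shows why a protein signature can be shorter than a nucleic acid signature.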

This invention can use liquid phase separation techniques and mass spectrometry to resolve proteins and protein fragments to facilitate automation and avoid the artefacts and inherent slowness and lack of automation in gel based techniques such as 2-D gel electrophoresis.

The reference terminus may be attached to a solid phase support to immobilize the population of polypeptides or peptide fragments thereof. Preferably, the population of polypeptides is immobilised before treatment with the cleavage agent. In this way, the peptide fragments produced on treatment with the cleavage agent remain immobilized and can be readily isolated by washing away unwanted material present in the liquid phase. The solid phase support may comprise suitable beads or other such supports well known in this art. Such supports or substrates may be chosen to bind selectively to either the N-terminus or the C-terminus and this is discussed in further detail below.

In one embodiment, the reference terminus is attached to the solid phase support by: (i) treating the polypeptides with a blocking agent to block all exposed reference groups, which comprise either carboxyl groups or primary amine groups; (ii) cleaving the reference terminal amino acids to expose unblocked reference termini; and (iii) treating the unblocked reference termini with an immobilization agent capable of coupling to the solid phase support; wherein step (b) comprises binding the treated reference termini to the solid phase support and removing unbound peptide fragments. In an alternative embodiment, the method further comprises (i) preparing the sample of step (a) by pre-treating the polypeptides with a blocking agent to block all exposed reference groups, which comprise either carboxyl groups or primary amine groups, so that subsequent treatment of the sample with the cleavage agent generates peptide fragments bearing unblocked reference termini; (ii) biotinylating the unblocked reference termini; and (iii) binding the peptide fragments containing the unblocked reference termini to a solid phase support; wherein step (b) comprises eluting unbound peptide fragments therefrom. Preferably, the immobilisation agent comprises a biotinylation agent.

The cleavage agent must recognize a specific amino acid residue or sequence of amino acids reliably. The cleavage site may be at the specific amino acid residue or sequence or at a known displacement therefrom. The cleavage agent may be a chemical cleavage agent such as cyanogen bromide. Preferably, the cleavage agent is a peptidase, such as a serine protease, preferably trypsin.

As discussed in further detail below, depending on the number of proteins or polypeptides in a given sample, it may be advantageous to sort the polypeptides into manageable sub-populations. Sorting can be effected before treatment of the sample with the cleavage agent or after cleavage. The sample of step (a) may comprise a sub-cellular fraction, in which case the method further comprises a step of sub-cellular fractionation before step (a). The sample of step (a) may be prepared by liquid chromatography of either a crude fraction or a sub-cellular fraction. A preferred method of determining the signature sequence is by mass spectrometry and this may be preceded by a high pressure liquid chromatography step to resolve the peptide fragments. Alternatively, the peptide fragments may be subjected to ion exchange chromatography before step (c), followed by sequencing by either mass spectrometry or other methods.

In accordance with the method of the present invention, the predetermined number of amino acid residues required to constitute the signature sequence will vary according to the size of the polypeptide population. Preferably, the predetermined number of amino acid residues is from 3 to 30, more preferably 3 to 6.

The present invention further provides a method for identifying polypeptides in a test sample. The method comprises characterizing the polypeptides as described above and comparing the signature sequences and relative positions of the cleavage site obtained thereby with the signature sequences and relative positions of the cleavage site of known polypeptides in order to identify the or each polypeptide in the test sample. This method can be used to identify a single unknown polypeptide or a population of unknown polypeptides by comparing their characteristics (i.e. their signature sequences and relative positions of cleavage site) with those of previously identified polypeptides. It is envisaged that a database of such characteristics can readily be compiled.
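The comparison against known polypeptides amounts to a lookup keyed on the pair (signature sequence, relative position of the cleavage site). A minimal sketch follows; the protein names and values are invented placeholders, and the functions `build_index` and `identify` are hypothetical helpers rather than part of any existing database interface.

```python
def build_index(known):
    """Map (signature, position) to the names of known polypeptides sharing it."""
    index = {}
    for name, characteristics in known.items():
        index.setdefault(characteristics, []).append(name)
    return index


def identify(observed, index):
    """Return all known polypeptides matching an observed (signature, position)."""
    return index.get(observed, [])


# Hypothetical reference entries: name -> (signature sequence, cleavage-site position)
known = {
    "protein_A": ("DEVF", 5),
    "protein_B": ("DEVF", 12),
    "protein_C": ("GLSW", 5),
}
index = build_index(known)
print(identify(("DEVF", 5), index))  # ['protein_A']
print(identify(("QQQQ", 3), index))  # []
```

Keying on both values matters: protein_A and protein_B share a signature sequence in this toy data and are separated only by the position of the cleavage site.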

In a further aspect, the present invention provides a method for assaying for one or more specific polypeptides in a test sample. The method comprises performing a method as described above, wherein the cleavage agent and relative position of the cleavage site are predetermined and the signature sequence is determined in step (c) by assaying for a predetermined sequence of amino acid residues running from the cleavage site. Preferably, the cleavage site and signature sequence are predetermined by selecting corresponding sequences from one or more known target polypeptides, such as those available from the database.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in further detail, by way of example only, with reference to the accompanying drawings, in which:

FIGS. 1A-1C show a reaction scheme according to one embodiment of the invention;

FIGS. 2A-2C show a reaction scheme according to another embodiment of the invention;

FIGS. 3A-3B show a reaction scheme according to a simple embodiment of the invention; and

FIGS. 4A-4C show a reaction scheme according to a variation of the embodiment shown in FIG. 1.

BRIEF DESCRIPTION OF THE INVENTION

Protein Signatures

The essence of this system is that one can immobilize a population of proteins onto a solid phase substrate at one terminus of the molecule. Proteins are directional, so a particular terminus can be chosen in a manner dependent on the chemistry of the immobilization agent. For example, the Edman reagent (phenyl isothiocyanate) can be used selectively to remove amino acids from the N-terminus of a protein; however, if phenyl isocyanate is used, the N-terminus is simply capped. A derivative of this molecule that could be coupled to a cleavable linker on a solid-phase substrate would allow a protein to be immobilized at its N-terminus and subsequently removed by cleavage of the linker. During peptide synthesis, the C-terminus is usually immobilized as a benzyl ester, through the use of a chloromethyl group. Such chemistry may be adapted to immobilize proteins at this terminus, if desired.

A population of immobilized proteins is then treated with a sequence specific peptidase such as trypsin to leave a population of N-terminal cleavage fragments. Such fragments can be considered to be analogous to an expressed sequence tag for a protein. One can then sequence the resultant peptide signatures by mass spectrometry. Terminal fragments are most meaningful, in that the position of each resultant peptide in the protein is known and the termini are usually accessible at the surface of most proteins.

Sorting Proteins

Obviously a population of proteins extracted from a cell will comprise a significant number of distinct species. If, as is thought, there are roughly 15000 genes expressed in the average human cell, one can expect as many proteins. Clearly one cannot sequence all of these by mass spectrometry in a single step with present technology. For this reason a protein population of such size needs to be sorted into manageable sub-sets.

A generalized system for profiling proteins must attempt to resolve a protein population into reasonably discrete subsets of relatively uniform size. This is most readily achieved by separation on the basis of global properties of proteins that vary over a broad and continuous range, such as size and surface charge, which are the properties used most effectively in 2-D gel electrophoresis. Such separations can be achieved as rapidly or more so using liquid chromatographic techniques. In fact, by following one liquid chromatography separation by another, one can resolve proteins in as many dimensions as one requires, since there is a great deal more flexibility in liquid chromatography separation systems, although one would ideally avoid too many separation steps to prevent sample loss.

Sorting can be effected during extraction, after extraction of proteins from their source tissue or after cleavage of immobilized peptides.

Sorting during cell fractionation

Proteins are intrinsically sorted in vivo, in terms of their compartmentalization within a cell. Various techniques are available that allow one to sort proteins on the basis of their cellular compartments. Fractionation protocols involve various cell lysis techniques such as sonication, detergents or mechanical cell lysis that can be coupled to a variety of fractionation techniques, mainly centrifugation. Separation into membrane proteins, cytosolic proteins and the major membrane bound sub-cellular compartments, such as the nucleus and mitochondria, is standard practice. Thus one can effectively ignore certain classes of protein if one chooses, e.g. mitochondrial proteins are likely to be uninteresting in many cases. Membrane, cytosolic and nuclear compartments will, on the whole, be of particular interest.

Sorting after extraction

Since proteins are highly heterogeneous molecules, numerous techniques for separation of proteins are available on the basis of size, hydrophobicity, surface charge and various combinations of the above using liquid chromatography in its various incarnations. Separation is effected by an assortment of solid phase matrices derivatised with various functionalities that adhere to and hence slow down the flow of proteins through the column on the basis of the properties above. Molecules are normally loaded into such columns in conditions favouring adhesion to the solid phase matrix and selectively washed off in steadily increasing quantities of a second buffer favouring elution. In this way the proteins with the weakest interactions with a given matrix elute first.

Various formats for liquid chromatography exist, but for greatest speed of throughput and for the most discrete separations High Pressure Liquid Chromatography (HPLC) formats are favoured. In this format the matrix is designed to be highly incompressible and, when derivatized, allows chromatographic separation to be performed at extremely high pressures, which favours rapid and discrete separation.

Sorting of cleaved peptides

Liquid chromatography mass spectrometry (LCMS) is a well developed field. HPLC systems directly coupled to electrospray mass spectrometers are in widespread use. HPLC is a fast and effective way of resolving peptides after they have been cleaved from their immobilized state.

Alternatively, sorting peptides by ion exchange chromatography might be advantageous, in that short peptides could be separated in an almost sequence dependent manner: the amino acids that are ionizable have known pKa values and hence elution of peptides from such a column at a specific pH would be indicative of the presence of particular amino acids in that sequence. For example, aspartate residues have a pKa of 3.9 and glutamate residues 4.3. Elution of a peptide at pH 4.3 would be indicative of the presence of glutamate in the peptide. These effects are sometimes masked in large proteins but should be distinct in short peptides, and hence would be extremely useful as sorting features.
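The link between pKa and ionisation state at a given pH follows from the Henderson-Hasselbalch relation. The sketch below uses the side-chain pKa values quoted above (3.9 for aspartate, 4.3 for glutamate) and assumes that they apply unchanged within a peptide, which is an approximation; the function name is hypothetical.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of an acidic group carrying a negative charge at a given pH
    (Henderson-Hasselbalch relation for a single ionisable group)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PKA = {"aspartate": 3.9, "glutamate": 4.3}  # values quoted in the text

for residue, pka in PKA.items():
    for ph in (3.9, 4.3):
        print(residue, ph, round(fraction_deprotonated(ph, pka), 2))
```

At a pH equal to its pKa a group is half ionised, so a glutamate side chain is half charged at pH 4.3, whereas an aspartate side chain is already mostly charged at that pH. This difference in charge is what an ion exchange separation of short peptides would exploit.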

Combination of the above techniques will allow various sorting protocols to be developed that will allow great control over the form of the protein profile generated. In this way, identification of most proteins expressed in a cell should be achievable.

Sequencing of peptides by mass spectrometry

Peptides can be readily sequenced directly by tandem mass spectrometry. In general, peptide mixtures are injected into the mass spectrometer by electrospray, which leaves them in the vapour phase. The first mass spectrometer acts as a filter selecting molecules to enter the second mass spectrometer on the basis of their mass-to-charge ratio, such that essentially only a single species enters the second mass spectrometer at a time. On leaving the first mass spectrometer, the selected peptide passes through a collision chamber, which results in fragmentation of the peptide. Since fragmentation occurs mostly at the peptide bond, the pattern of fragments corresponds to a series of subspecies of peptides and amino acids that compose the original peptide. The distinct pattern of masses of single amino acids, 2-mers, 3-mers, etc. generated in the fragmentation of the peptide is sufficient to identify its sequence.
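How a ladder of fragment masses yields a sequence can be shown with an idealised example. The sketch below is not a model of a real spectrum: it assumes a complete, noise-free series of N-terminal prefix fragments expressed as neutral sums of residue masses, and it ignores charge states, terminal groups and competing ion series. The residue masses are standard monoisotopic values; the function names are hypothetical.

```python
# Standard monoisotopic residue masses in daltons
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def prefix_masses(peptide):
    """Idealised ladder: summed residue mass of every N-terminal prefix."""
    total, ladder = 0.0, []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(total)
    return ladder


def read_sequence(ladder, tolerance=0.01):
    """Assign a residue to each step between successive ladder masses."""
    sequence, previous = [], 0.0
    for mass in sorted(ladder):
        delta = mass - previous
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tolerance]
        if len(matches) == 1:
            sequence.append(matches[0])
        elif matches:
            sequence.append("[" + "".join(matches) + "]")  # isobaric, e.g. L and I
        else:
            sequence.append("?")                           # no residue fits this step
        previous = mass
    return "".join(sequence)


print(read_sequence(prefix_masses("DEVFY")))  # DEVFY
```

Leucine and isoleucine have identical residue masses, so a step corresponding to either is reported as ambiguous by this sketch; mass differences alone cannot separate them.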

The end result is then that a population of proteins can be arbitrarily sorted into populations of peptides of convenient size to be fed into an electrospray tandem mass spectrometer for direct sequencing. Completion of such an analysis for an entire cell's proteins would give a profile of what proteins are present and in what relative quantities. Absolute quantitation could be achieved by `spiking` a protein population with known quantities of particular proteins known to be absent, e.g. plant proteins in animal samples or vice versa, against which to calibrate results.
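The calibration against a spiked standard can be sketched as a single-point scaling. This is an assumption-laden illustration: it supposes that signal is proportional to amount and that the spike and the analytes respond equally, neither of which is guaranteed in practice. The names, units and values are hypothetical.

```python
def absolute_amounts(signals, spike_name, spike_amount):
    """Convert relative signals to absolute amounts using one spiked standard.

    signals: mapping of protein name -> measured signal (arbitrary units)
    spike_amount: known quantity of the spiked protein (e.g. in pmol)
    """
    response = signals[spike_name] / spike_amount  # signal per unit amount
    return {
        name: signal / response
        for name, signal in signals.items()
        if name != spike_name
    }


# Hypothetical readings with 10 pmol of a plant protein spiked into an animal sample
signals = {"plant_spike": 2000.0, "protein_A": 1000.0, "protein_B": 4000.0}
print(absolute_amounts(signals, "plant_spike", 10.0))
# {'protein_A': 5.0, 'protein_B': 20.0}
```

Using several spikes at different known quantities would allow the assumed proportionality to be checked rather than taken on trust.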

Protein Signatures

This invention provides a method of capturing a population of proteins onto a solid phase substrate by one terminus of each protein in the population. This invention also provides a method of cleaving proteins that have been derivatized at one terminus with an agent that can be used to immobilise that terminus on a solid phase substrate. This allows a single peptide for each protein in a population to be captured onto a solid phase substrate, so that peptides from the chosen terminus can be separated from other peptides generated by the cleavage step and can be isolated. This invention also provides a method to allow all the peptides generated in a cleavage step that are not from the reference terminus to be captured, leaving a single terminal peptide per protein free in solution for analysis.

A population of peptides generated according to the methods of this invention can be analysed in a number of ways, preferably by mass spectrometry.

Two forms of analysis are preferred. The first is to determine peptide mass fingerprints for the population of signature peptides generated. In this method the mass of each peptide, preferably the accurate mass, is determined. A significant proportion of signature peptides should be uniquely identified by this form of analysis. Any mass peaks that are unknown can be further characterised by the second form of preferred analysis. Ions of a specific mass can be selected for collision induced dissociation in a tandem mass spectrometer. This technique can be used to determine sequence information for a peptide.
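The division of labour between the two forms of analysis can be illustrated by asking which signature peptides are distinguishable by mass alone. The sketch below assumes underivatised peptides, standard monoisotopic residue masses and an arbitrary mass tolerance of 0.01 Da; the peptide sequences and function names are hypothetical.

```python
# Standard monoisotopic residue masses in daltons
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def split_by_uniqueness(peptides, tolerance=0.01):
    """Separate peptides identifiable by mass alone from those that clash."""
    masses = {p: peptide_mass(p) for p in peptides}
    unique, ambiguous = [], []
    for peptide, mass in masses.items():
        clash = any(
            other != peptide and abs(masses[other] - mass) <= tolerance
            for other in masses
        )
        (ambiguous if clash else unique).append(peptide)
    return unique, ambiguous


unique, ambiguous = split_by_uniqueness(["AGLR", "DEVFY", "VEDFY"])
print(unique)     # ['AGLR']
print(ambiguous)  # ['DEVFY', 'VEDFY']
```

Peptides with the same composition in a different order have the same mass, so they fall into the ambiguous group and would be candidates for the second form of analysis by collision induced dissociation.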

Capturing Peptides

This invention provides methods that exploit derivatization of proteins with various agents, including existing peptide sequencing reagents, to isolate a single `signature` peptide from each member of a population of proteins. This invention may be practiced in two formats. The methods of this invention allow a reference terminus to be selected from the proteins in a population. In the first format this reference terminus may be derivatized with an immobilization agent. If the proteins derivatized in this manner are treated with a sequence specific cleavage agent to generate peptides, the peptides from the reference termini of the proteins in a mixture can be specifically captured, leaving the remaining peptides free in solution. This first format is discussed in the following section headed "Format 1". In the second format, discussed in the section headed "Format 2", a single peptide sample per protein is generated by capturing the peptide fragments that are not from the chosen reference terminus, thus leaving the signature peptides free in solution.

Format 1

In the simplest embodiment of this invention, as shown schematically in FIG. 3, a population of proteins is reacted with a modified sequencing agent specific for one terminus of each protein in the population. The modified sequencing agent carries an immobilization agent in order that proteins derivatised with the sequencing agent may be captured onto a solid phase support. The captured proteins may then be cleaved with a sequence specific cleavage agent. This cleavage step will generate a series of peptide fragments in solution and will leave a single peptide per protein captured on the solid phase support. The peptides free in solution are then washed away. The immobilized peptides can then be released from the solid phase support by completing the sequencing reaction for the coupled terminal amino acid. The Edman reagent (phenyl isothiocyanate) could be modified to carry an immobilization agent: the phenyl ring could be substituted with a group linked to an appropriate immobilisation effector such as biotin. A population of proteins derivatised with this reagent could be cleaved with trypsin. The derivatised terminal peptides could then be immobilized on an avidinated solid phase support, allowing underivatized peptides to be washed away. The peptides could then be released from the solid phase support by disrupting the avidin-biotin interaction. This will leave N-terminal peptides free in solution. These peptides can then be analysed by mass spectrometry. It may be desirable to fractionate the peptides prior to mass spectrometry but this fractionation step is optional. Alternatively, a modified C-terminal sequencing agent might be used to capture proteins by the C-terminus. The C-terminus is generally not post-translationally modified and so may be the preferred terminus to capture a population of proteins. Further embodiments of this invention are discussed below.

C-terminal sequencing agents:

Unmodified C-terminal sequencing agents can be used to generate a signature peptide. A further embodiment of the present invention is as follows and is described schematically in FIG. 1. In the first step a protein population extracted from a tissue is loosely immobilized onto a membrane, such as a PVDF membrane. The solvents used to extract proteins from a tissue sample are generally very harsh, usually containing agents such as urea, thiourea and detergents, since proteins have widely varying solubilities. Immobilizing extracted proteins onto a membrane allows them to be washed with other solvents prior to modification. The protein population, thus captured, is then derivatised with a coupling agent, such as diphenyl phosphoroisothiocyanatidate from Hewlett-Packard (Miller et al., Techniques in Protein Chemistry VI, 219-227), in a method that is essentially the same as that which one would use for a normal sequencing reaction for a single protein, giving peptidylacylisothiocyanates for all proteins. The coupling reagent also reacts with other free carboxyl groups, likewise giving acylisothiocyanate derivatives. The coupling agent may, however, react incompletely with some carboxylic acid side chains. It may, therefore, be desirable to perform additional derivatisation steps using more reactive reagents to ensure that all free carboxyl groups are derivatised. This variation is shown in FIG. 4. The derivatized protein population is then treated with pyridine to effect ring closure of the terminal acylisothiocyanate derivative. One can then cleave the C-terminal residue by addition of a cleavage agent such as trimethylsilanolate, from Hewlett-Packard, which cleaves the terminal amino acid from each protein, releasing the thiohydantoin-amino acid derivative of the terminal amino acid. This exposes a free carboxyl at the penultimate residue of each protein. This can be specifically derivatized with biotin using 5-(biotinamido)pentylamine since all other carboxyl groups are derivatized.

In this way all the proteins in a population can be derivatised at the C-terminus with biotin. The biotinylated population, still on the PVDF membrane, is then treated with an appropriate sequence specific cleavage agent. Trypsin is generally used for mass spectrometry applications as this generally leaves the N-terminal side of the cleavage site protonated, which is desirable. Trypsin specifically cleaves adjacent to basic residues. If an enzyme is used, the immobilized peptides would have to be washed with some form of physiological buffer to allow the enzyme to function. This will leave a population of cleaved peptides, some of which are biotinylated, which can be desorbed from the PVDF membrane into solution. The biotinylated peptides can be captured using a solid phase matrix derivatized with monomeric avidin. Non-immobilized peptides can then be washed away, leaving an immobilized population of C-terminal peptides which comprise the tag used to identify proteins in a population. After washing away free peptides, the immobilised tags can be released from the solid phase support by addition of acid, which disrupts the biotin/avidin interaction; monomeric avidin is best for this purpose. In an alternative embodiment the biotinylated peptides can be captured on an avidinated support prior to sequence specific cleavage.

N-terminal sequencing agents:

N-termini of a large proportion of cellular proteins are blocked. For the purposes of profiling those proteins whose N-termini are not blocked, one can use the corresponding N-terminal sequencing agents to derivatize amino groups including the terminal amino group. The terminal amino acid can be cleaved and the newly exposed amine at the penultimate amino acid can be derivatised with an immobilization agent such as biotin. The biotinylated proteins can then be cleaved and the terminal signature peptides can be captured and analyzed. This would however be limited to those N-termini that are not already blocked.

Format 2

This method is shown schematically in FIG. 2. In this method a reagent that derivatizes carboxyl residues is used to cap all carboxyl residues including the C-terminal carboxyl group in the protein population of interest. The protein population is then cleaved with trypsin or another sequence specific cleavage reagent that cleaves at the peptide bond to generate an amino group and a carboxyl group on the C-terminal and N-terminal fragments respectively. At this stage all peptides except the C-terminal peptides, which are capped, will have a free carboxyl. These free carboxyls can be derivatized with 5-(biotinamido)pentylamine or some other immobilization agent. If biotin is used then one can capture all the biotinylated non C-terminal peptides onto a solid phase matrix derivatized with avidin. An avidinated affinity column in-line with a mass spectrometer would allow C-terminal peptides to be selectively eluted directly into the mass spectrometer for analysis.
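The negative selection of Format 2 can be sketched as a partition of the cleavage products. The illustration below assumes complete capping of every pre-existing carboxyl group, a simplified trypsin-like rule (cleavage after each lysine or arginine) and complete capture of biotinylated peptides; the function names and the example sequence are hypothetical.

```python
def cleave_after(sequence, residues="KR"):
    """Split a sequence after every occurrence of a recognised residue."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def format2_partition(protein):
    """Return (captured, flow_through) for one carboxyl-capped protein.

    Every fragment except the last gains a new, uncapped carboxyl at the
    cleavage site, so it can be biotinylated and retained on avidin. The last
    fragment carries only the original, capped C-terminus and passes through.
    """
    fragments = cleave_after(protein)
    captured = fragments[:-1]       # free carboxyl -> biotinylated -> retained
    flow_through = fragments[-1:]   # capped C-terminal signature peptide
    return captured, flow_through


print(format2_partition("MSTKAGLRDEVFY"))
# (['MSTK', 'AGLR'], ['DEVFY'])
```

The same partition applies in mirror image when amine groups are capped and the N-terminal peptide is the one left free in solution.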

This technique is equally applicable to generating peptide tags from the N-terminus of a population of proteins. Reagents which derivatize amine groups can be used to selectively cap all amine groups on a protein including the N-terminal amine group. Cleavage will expose amines in non-terminal peptides which can be derivatized with biotin, allowing selective capture of non N-terminal peptides. This is important since many proteins are modified at the N-terminus and the N-terminal amine is often inaccessible to reagents. Thus selectively capturing non N-terminal peptides is a means of generating a signature at the N-terminus.

The reagents to derivatise amines and carboxyls are also simpler than those necessary for the coupling agents used in sequencing reactions.

Immobilisation Agents

It is possible to capture derivatised peptides with a variety of chemical agents. In the discussion of the methods of this invention biotin has been chosen as an exemplary immobilisation agent due to its highly specific interactions with avidin. Other immobilisation agents besides biotin are compatible with the methods of this invention. The following are examples and the invention is not limited to these.

A linker to hexahistidine would allow peptide tags to be captured onto a coordinated metal ion derivatized column. Various antibody-antigen interactions could be used as well, where an antibody or antigen rather than biotin is tagged onto the penultimate amino acid.

Antibodies against derivatives:

The most common N-terminal modification is acetylation. It should be possible to raise an antibody against N-terminally acetylated peptides to permit these to be captured using an affinity column derivatized with such an antibody. In order to capture substantially all proteins one can derivatize the remaining proteins in a sample, that are not already acetylated, with an acetylation agent. The derivatized proteins can then be cleaved with chymotrypsin or another sequence specific agent (trypsin does not cleave acetylated cleavage sites of proteins). An anti-N-terminal acetylation antibody immobilised on an appropriate matrix could be used to generate an affinity column. Such a column could be used to capture peptide signatures with acetylated N-termini after their source proteins have been cleaved.

To capture C-terminal peptides one could raise an antibody against thiohydantoin derivatives of peptides which could be used to selectively capture a peptide from a protein that had been derivatised with a coupling agent for sequencing prior to cleavage with trypsin or another sequence specific cleavage agent.

Derivatisation of proteins:

The methods of this invention include derivatization steps which are required to ensure that the reference terminus of each protein in a population is specifically derivatised with an immobilisation agent in the first format or, in the second format, to ensure that the reference terminus is specifically blocked from reaction with an immobilisation agent. Additional derivatization steps may also be performed. These may be desirable if fractionation of signature peptides is to be performed prior to mass spectrometry analysis. There are two important factors that should be considered with regard to any fractionation steps. These factors are the resolution of the fractionation step and the consequent sample loss imposed by the fractionation.

Certain chromatographic techniques are `sticky` when used for the separation of peptides, that is to say a proportion of the sample is retained on the separation matrix. It is possible to reduce sample loss of this kind by derivatizing the groups that are involved in adhesion to the separation matrix. That is to say, if one is using an ion exchange chromatography separation one can derivatize ionic and polar side chains with reagents that increase their hydrophobicity, thus reducing affinity to the matrix. This will, however, reduce the resolution of the separation.

It is desirable to ensure that only one mass peak per peptide appears in the mass spectrum generated by analysis of a population of signature peptides. It may, therefore, be desirable to derivatise polar and ionic side chains of signature peptides in order to reduce the number of ionization states accessible to those peptides. This step should help promote the formation of a single ion species per signature peptide.
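To illustrate why several accessible ionization states complicate a spectrum, the following minimal Python sketch lists the mass/charge values a single peptide would show if it carried one, two or three added protons. The function name `mz_values`, the example mass and the assumption that each charge is an added proton (positive-ion mode) are illustrative and are not taken from this specification.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mz_values(neutral_mass, max_charge):
    """Return {charge: m/z} for a peptide carrying 1..max_charge added protons.

    Assumption for illustration: every charge state arises from protonation.
    """
    return {
        z: (neutral_mass + z * PROTON_MASS) / z
        for z in range(1, max_charge + 1)
    }


if __name__ == "__main__":
    # One peptide, three charge states, therefore three separate peaks.
    for charge, mz in mz_values(1500.0, 3).items():
        print(f"z={charge}: m/z={mz:.4f}")
```

Each extra charge state adds a peak for the same peptide, which is why restricting a signature peptide to a single ion species simplifies the spectrum.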

It may also be desirable to add a group to each signature peptide to increase the sensitivity of the mass spectrometry analysis. A particularly good `sensitizing` group to add to a peptide would be a tertiary ammonium ion which is a positively charged entity with excellent detection properties.

Pre-Sorting Steps:

This technology can be used to profile peptide populations generated in numerous ways. Various fractionation techniques exist to sub-sort proteins on the basis of certain features. Of particular interest is the analysis of signalling pathways. Phosphorylation of proteins by kinases is a feature of many signalling pathways. Proteins that can be phosphorylated by a kinase often have a short phosphorylation motif that a kinase recognizes. Antibodies exist that bind to such motifs, some binding phosphorylated forms while others bind the non-phosphorylated state. Antibody affinity columns or immuno-precipitation of kinase target sub-populations followed by profiling would be of great interest in identifying these proteins and in monitoring their metabolism simultaneously in time resolved studies of live model systems.

Many proteins exist as complexes and analysis of such complexes is often tricky. A cloned protein that is a putative member of a complex allows one to generate an affinity column with that protein to trap other proteins that bind to it. This profiling technology is eminently suited to analysis of such captured protein complexes.

Kits including antibody affinity columns to analyse signal transduction or membrane location by capturing proteins with the appropriate post-translational modifications are envisaged either as a pre-sorting step or as a capture step after cleavage of a protein population with a sequence specific cleavage agent.

Chromatographic techniques:

Having generated peptide tags from a population of proteins it is then desirable to analyse the resultant tags. Chromatography is an optional step in the analysis of a population of peptide signatures prior to mass spectrometry but may be quite desirable depending on the configuration of the mass spectrometer used.

Two important features are required of any chromatographic stage in a protein profiling method: high resolution and minimal sample loss. Resolution generates information and also reduces the complexity of the peptide tag population entering the mass spectrometer. Sample loss in the chromatographic separation must be minimal, since any loss would reduce the sensitivity of the technique to low frequency peptides in the population under analysis.

Derivatisation of proteins:

Certain chromatographic techniques are `sticky` when used for the separation of peptides, that is to say a proportion of the sample is retained on the separation matrix. It is possible to reduce sample loss of this kind by derivatizing the groups that are involved in adhesion to the separation matrix. That is to say, if one is using an ion exchange chromatography separation one can derivatise ionic and polar side chains with reagents that increase their hydrophobicity, thus reducing affinity to the matrix. This feature needs to be balanced against the need for resolution, though.

The C-terminal sequencing agents can be used to derivatise the free carboxyl groups, which will reduce the adhesion between such peptides and a cation exchange resin. This may mean that cation exchange chromatography is advantageous as a chromatographic separation step.

One can quite readily acetylate amine residues to achieve similar effects for anion exchange chromatography.

Analysis of Peptides by Mass Spectrometry

Ionisation Techniques:

In general peptide mixtures are injected into the mass spectrometer by electrospray or MALDI TOF, which leaves them in the vapour phase.

Electrospray Ionisation:

Electrospray ionization requires that the dilute solution of biomolecule be `atomized` into the spectrometer from an insertion probe, i.e. in a fine spray. The solution is, for example, sprayed from the tip of a needle in an electrostatic field gradient. The mechanism of ionization is not fully understood but it is thought to work broadly as follows. The electrostatic field charges droplets formed at the probe tip, promoting atomization. In the stream of nitrogen the solvent is evaporated. With a small droplet, this results in concentration of the biomolecule. Given that most biomolecules have a net charge, this increases the electrostatic repulsion of the dissolved protein. As evaporation continues this repulsion ultimately becomes greater than the surface tension of the droplet and the droplet `explodes` into smaller droplets. The electrostatic field helps to further overcome the surface tension of the charged droplets. The evaporation continues from the smaller droplets which, in turn, explode iteratively until essentially the biomolecules are in the vapour phase, as is all the solvent.
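The balance between electrostatic repulsion and surface tension described above is commonly estimated with the Rayleigh limit, q = 8π·sqrt(ε0·γ·r³), the largest charge a droplet of radius r and surface tension γ can hold before it breaks up. The formula is standard electrostatics and is not taken from this specification; the function name and the example values (a water-like surface tension of 0.072 N/m) are assumptions made for illustration only.

```python
import math

VACUUM_PERMITTIVITY = 8.8541878128e-12  # farads per metre


def rayleigh_limit_charge(radius_m, surface_tension_n_per_m):
    """Maximum charge (coulombs) a spherical droplet can carry before fission."""
    return 8.0 * math.pi * math.sqrt(
        VACUUM_PERMITTIVITY * surface_tension_n_per_m * radius_m ** 3
    )


if __name__ == "__main__":
    # As the solvent evaporates the radius shrinks and so does the charge the
    # droplet can hold, while the charge actually carried stays the same.
    for radius in (1e-6, 5e-7, 1e-7):
        limit = rayleigh_limit_charge(radius, 0.072)
        print(f"radius {radius:.1e} m: limit {limit:.3e} C")
```

Because the limit falls as r to the power 1.5, an evaporating droplet that keeps its charge must eventually exceed the limit, which is consistent with the iterative `explosions` described in the text.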

Atmospheric Pressure Chemical Ionization:

An ionization technique appropriate for use with LCMS for analysing peptides is Atmospheric Pressure Chemical Ionisation (APCI). This is an electrospray based technique where the ionisation chamber is modified to include a discharge electrode which can be used to ionize the bath gas, which in turn will collide with the vaporized sample molecules, increasing ionization of the sample.

Fast Atom Bombardment:

This is an ionisation technique that is quite similar to APCI and is highly compatible with samples in solution. Typically, a continuous flow of liquid from a capillary electrophoresis column or an HPLC column can be pumped through an insertion probe to a hole or a frit at its tip where the solution is bombarded by accelerated atoms or ions, usually of xenon or caesium. Collision with the dissolved sample results in transfer of kinetic energy to and ionization of the sample.

Matrix Assisted Laser Desorption Ionization (MALDI):

MALDI requires that the biomolecule solution be embedded in a large molar excess of a photo-excitable `matrix`. The application of laser light of the appropriate frequency (266 nm beam for nicotinic acid) results in the excitation of the matrix which in turn leads to excitation and ionization of the embedded biomolecule. This technique imparts a significant quantity of translational energy to ions, but tends not to induce excessive fragmentation despite this. Accelerating voltages can again be used to control fragmentation with this technique, though.

MALDI techniques can be supported in two ways. One can embed proteins in a MALDI matrix, where the proteins themselves are not specifically excitable by laser, or one can construct peptide labels that contain the necessary groups to allow laser energization. The latter approach means the labels do not need to be embedded in a matrix before performing mass spectrometry. Such groups include nicotinic, sinapinic or cinnamic acid moieties. MALDI based cleavage of labels would probably be most effective with a photocleavable linker as this would avoid a cleavage step prior to performing MALDI mass spectrometry. The various excitable ionization agents have different excitation frequencies so that a different frequency can be chosen to trigger ionization from that used to cleave the photolysable linker. These excitable moieties are easily derivatised using standard synthetic techniques in organic chemistry so labels with multiple masses can be constructed in a combinatorial manner.

All of the above techniques are routinely used with peptides and proteins and are preferred methods of ionization with this invention.

Mass Spectrometric Sensitivity and Quantitation of Peptide Tags

The end result is then that a population of proteins can be arbitrarily sorted into populations of peptides of convenient size to be fed into a mass spectrometer for analysis. Completion of such an analysis for an entire cell's proteins would give a profile of what proteins are present and in what relative quantities. Absolute quantitation could be achieved by `spiking` a protein population with known quantities of particular proteins, known to be absent, e.g. plant proteins in animal samples or vice versa, against which to calibrate results. Internal quantities can be determined by measuring relative quantities of certain proteins present at relatively fixed concentrations in most cells, such as histones. Various techniques coupled to certain mass spectrometer geometries permit good quantitation with a mass spectrometer. These issues are dealt with fully in GB 9719284.3.
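The arithmetic behind calibration against a `spiked` standard can be sketched as follows. This is a minimal illustration, assuming that detector signal is proportional to amount and that the analyte and the spiked protein respond equally; neither assumption is stated in this specification, and the function name and units are hypothetical.

```python
def quantify_against_spike(analyte_signal, spike_signal, spike_amount):
    """Estimate an analyte amount from its signal relative to a spiked standard.

    Assumes a linear detector response that is the same for both species.
    The result is in the same units as spike_amount.
    """
    if spike_signal <= 0:
        raise ValueError("the spiked standard must give a positive signal")
    return spike_amount * analyte_signal / spike_signal


if __name__ == "__main__":
    # A peptide giving twice the signal of a 5.0 pmol spike is estimated at 10 pmol.
    print(quantify_against_spike(2000.0, 1000.0, 5.0))
```

In practice the response factors of different peptides differ, so a real calibration would need to account for that.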

Mass Analyser Geometries

Mass spectrometry is a highly diverse discipline and numerous mass analyzer configurations exist, which can often be combined in a variety of geometries to permit analysis of complex organic molecules such as the peptide tags generated with this invention.

Accurate Mass Measurement

Double focussing mass spectrometers are capable of measuring molecular masses to a very high accuracy, i.e. fractions of a dalton. This permits one to distinguish molecules with identical integer mass but different atomic compositions with ease as fractional differences in the mass of different atomic isotopes allow such distinctions. For determining the molecular masses of a population of peptide tags, this technique may be very effective as it would allow identification of a significant proportion of peptides without requiring any sequencing even if some do have the same integral mass. The few ambiguous peptides that remain could be analysed by tandem mass spectrometry as discussed below.
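As a small worked example of peptides sharing an integer mass but differing in exact mass, the Python sketch below compares two dipeptides that differ only in lysine versus glutamine. The monoisotopic residue masses are standard reference values; the function name and the choice of peptides are illustrative and not part of this specification.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide given in one-letter code."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


if __name__ == "__main__":
    for peptide in ("GK", "GQ"):
        mass = peptide_mass(peptide)
        print(f"{peptide}: integer mass {round(mass)}, exact mass {mass:.5f}")
```

Both dipeptides round to the same integer mass, yet their exact masses differ by about 0.036 Da, a difference an instrument measuring to fractions of a dalton could resolve.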

Sequencing of peptide tags by Tandem mass spectrometry:

Peptides can be readily sequenced by tandem mass spectrometry. Tandem mass spectrometry describes a number of techniques in which ions from a sample are selected by a first mass analyzer on the basis of their mass/charge ratio for further analysis by induced fragmentation of those selected ions. The fragmentation products are analysed by a second mass analyzer. The first mass analyser in a tandem instrument acts as a filter selecting ions to enter the second mass analyser on the basis of their mass/charge ratio, such that essentially a species of only a single mass/charge ratio, usually only a single peptide ion, enters the second mass analyser at a time. On leaving the first mass analyzer, the selected peptide passes through a collision chamber, which results in fragmentation of the peptide. Since fragmentation occurs mostly at the peptide bond, the pattern of fragments corresponds to a series of subspecies of peptides and amino acids that compose the original peptide. The distinct pattern of masses of single amino acids, 2-mers, 3-mers, etc. generated in the fragmentation of a peptide is sufficient to identify its sequence.

    ION SOURCE→MS1→COLLISION CELL→MS2→ION DETECTOR
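How a pattern of fragment masses identifies a sequence can be sketched with an idealised ladder of N-terminal fragments (commonly called b-ions): each successive fragment is heavier than the last by the mass of one residue. The Python sketch below is an illustration under simplifying assumptions (a complete, noise-free, singly charged ladder; leucine and isoleucine written as L because they are isobaric); the function names and tolerance are hypothetical and are not taken from this specification.

```python
# Monoisotopic residue masses in daltons; isoleucine is omitted because it
# has the same mass as leucine and cannot be told apart by mass alone.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ion_ladder(peptide):
    """Singly charged N-terminal fragment masses for prefixes of length 1..n."""
    ladder, total = [], PROTON
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(total)
    return ladder


def sequence_from_ladder(ladder, tolerance=0.01):
    """Read residues from the mass differences between successive fragments."""
    sequence, previous = [], PROTON
    for mass in ladder:
        step = mass - previous
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - step) <= tolerance]
        # "?" marks a step that matches no residue, or more than one.
        sequence.append(matches[0] if len(matches) == 1 else "?")
        previous = mass
    return "".join(sequence)


if __name__ == "__main__":
    ladder = b_ion_ladder("GASPK")
    print([round(m, 3) for m in ladder])
    print(sequence_from_ladder(ladder))
```

Real spectra contain gaps, noise and several fragment series, so practical sequencing software is considerably more involved than this sketch.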

Various tandem geometries are possible. Conventional `sector` instruments can be used where the electric sector provides the first mass analyzer stage, the magnetic sector provides the second mass analyser, with a collision cell placed between the two sectors. This geometry is not ideal for peptide sequencing. Two complete sector mass analyzers separated by a collision cell could be used for peptide sequencing. A more typical geometry used is a triple quadrupole where the first quadrupole filters ions for collision. The second quadrupole in a triple quadrupole acts as a collision chamber while the final quadrupole analyses the fragmentation products. This geometry is quite favorable. Another more favorable geometry is a Quadrupole/Orthogonal Time of Flight tandem instrument where the high scanning rate of a quadrupole is coupled to the greater sensitivity of a TOF mass analyser to identify the products of fragmentation.

Sequencing with Ion Traps:

Ion Trap mass spectrometers are a relative of the quadrupole spectrometer. The ion trap generally has a 3 electrode construction--a cylindrical electrode with `cap` electrodes at each end forming a cavity. A sinusoidal radio frequency potential is applied to the cylindrical electrode while the cap electrodes are biased with DC or AC potential. Ions injected into the cavity are constrained to a stable circular trajectory by the oscillating electric field of the cylindrical electrode. However, for a given amplitude of the oscillating potential, certain ions will have an unstable trajectory and will be ejected from the trap. A sample of ions injected into the trap can be sequentially ejected from the trap according to their mass/charge ratio by altering the oscillating radio frequency potential. The ejected ions can then be detected allowing a mass spectrum to be produced.

Ion traps are generally operated with a small quantity of a `bath gas`, such as helium, present in the ion trap cavity. This increases both the resolution and the sensitivity of the device by collision with trapped ions. Collisions both increase ionisation when a sample is introduced into the trap and damp the amplitude and velocity of ion trajectories, keeping them nearer the centre of the trap. This means that when the oscillating potential is changed, ions whose trajectories become unstable gain energy more rapidly, relative to the damped circulating ions, and exit the trap in a tighter bunch giving narrower, larger peaks.

Ion traps can mimic tandem mass spectrometer geometries; in fact they can mimic multiple mass spectrometer geometries, allowing complex analyses of trapped ions. A single mass species from a sample can be retained in a trap, i.e. all other species can be ejected, and then the retained species can be carefully excited by super-imposing a second oscillating frequency on the first. The excited ions will then collide with the bath gas and will fragment if sufficiently excited. The fragments can then be analysed further. One can retain a fragment ion for further analysis by ejecting other ions and then exciting the fragment ion to fragment. This process can be repeated for as long as sufficient sample exists to permit further analysis. It should be noted that these instruments generally retain a high proportion of fragment ions after induced fragmentation. These instruments and FTICR mass spectrometers (discussed below) represent a form of temporally resolved tandem mass spectrometry rather than spatially resolved tandem mass spectrometry which is found in linear mass spectrometers.

For the purposes of protein profiling a peptide population, an ion trap is quite a good instrument. A sample of peptide tags can be injected into the spectrometer. Peptide tags that are expected to appear in a profile, such as housekeeping proteins or histone peptides from eukaryote cell samples, can be ejected specifically and quantified rapidly. The remaining peptides can be scanned. Totally new peptides can then be selectively retained from subsequent samples of the peptide population and can be induced to fragment, allowing sequence data for the peptide to be acquired. Alternatively an Ion Trap can form the first stage of a tandem geometry instrument.

Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FTICR MS):

FTICR mass spectrometry has similar features to ion traps in that a sample of ions is retained within a cavity, but in FTICR MS the ions are trapped in a high vacuum chamber by crossed electric and magnetic fields. The electric field is generated by a pair of plate electrodes that form two sides of a box. The box is contained in the field of a superconducting magnet which, in conjunction with the two plates, the `trapping plates`, constrains injected ions to a circular trajectory between the trapping plates, perpendicular to the applied magnetic field. The ions are excited to larger orbits by applying a radiofrequency pulse to two `transmitter plates` which form two further opposing sides of the box. The cycloidal motion of the ions generates corresponding electric fields in the remaining two opposing sides of the box, which comprise the `receiver plates`. The excitation pulses excite ions to large orbits which decay as the coherent motion of the ions is lost through collisions. The corresponding signals detected by the receiver plates are converted to a mass spectrum by Fourier transform analysis.
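The link between the detected signal and mass can be illustrated with the standard cyclotron relation f = qB/(2πm): each mass/charge ratio orbits at its own frequency, so the frequencies recovered by Fourier transform map directly onto masses. The Python sketch below is an illustration using that textbook relation; the function names and the 7 tesla example field are assumptions and are not taken from this specification.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def cyclotron_frequency_hz(mz, field_tesla):
    """Unperturbed cyclotron frequency of an ion with the given m/z (Da per charge)."""
    return ELEMENTARY_CHARGE * field_tesla / (2.0 * math.pi * mz * DALTON)


def mz_from_frequency(frequency_hz, field_tesla):
    """Invert the relation: recover m/z from a measured orbital frequency."""
    return ELEMENTARY_CHARGE * field_tesla / (2.0 * math.pi * frequency_hz * DALTON)


if __name__ == "__main__":
    for mz in (500.0, 1000.0, 2000.0):
        print(f"m/z {mz:.0f}: {cyclotron_frequency_hz(mz, 7.0) / 1e3:.1f} kHz")
```

Because frequency can be measured very precisely, this relation underlies the high mass resolution attributed to these instruments; trapping fields shift the observed frequency slightly, which the sketch ignores.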

For induced fragmentation experiments these instruments can perform in a similar manner to an ion trap--all ions except a single species of interest can be ejected from the trap. A collision gas can be introduced into the trap and fragmentation can be induced. The fragment ions can be subsequently analysed. Generally fragmentation products and bath gas combine to give poor resolution if analysed by FT of signals detected by the `receiver plates`, however the fragment ions can be ejected from the cavity and analyzed in a tandem configuration with a quadrupole, for example.

For protein profiling FTICR MS could be used and may be advantageous as these instruments have a very high mass resolution allowing for accurate mass measurement so that peptides with the same integer mass but different atomic compositions can be resolved. Furthermore unidentified peptide tags can be subsequently analyzed by fragmentation.

Protein Immobilisation

A great deal of knowledge has been accumulated about specific protein chemistries, particularly in the area of organic synthesis of peptides.

R. B. Merrifield, Science 232: 341-347, 1986.

S. B. H. Kent, "Chemical Synthesis of Peptides and Proteins", Annu. Rev. Biochem. 1988. 57: 957-989.

Linkers

An important feature of this invention is cleavable linkers to their relevant biomolecules. Photocleavable linkers are particularly desirable as they allow for rapid, reagentless cleavage. For references, see:

Theodora W. Greene, "Protective Groups in Organic Synthesis", 1981, Wiley-Interscience.

On photoremovable groups:

Patchornik, J. Am. Chem. Soc. 92: 6333 -, 1970.

Amit et al, J. Org. Chem. 39: 192 -, 1974.

Liquid Chromatography:

R. Scopes, "Protein Purification: Principles and Practice", Springer-Verlag, 1982.

M. Deutscher, "Guide to Protein Purification", Academic Press, 1990.

Mass Spectrometry:

Electrospray mass spectrometry is the preferred technique for sequencing peptides since it is a very soft technique and can be directly coupled to the liquid phase molecular biology used in this invention. For a full discussion of mass spectrometry techniques see:

K. Biemann, "Mass Spectrometry of Peptides and Proteins", Annu. Rev. Biochem. 1992, 61: 977-1010.

R. A. W. Johnstone and M. E. Rose, "Mass Spectrometry for chemists and biochemists", 2nd edition, Cambridge University Press, 1996.

EXPERIMENT Outline of Embodiment of Protein Profiling

This comprises a system where

(i) A protein has its carboxyl groups protected and the last amino acid removed, leaving just one carboxyl group free at the cleaved terminus.

(ii) This is reacted with a biotinylation reagent, so that the carboxy terminus is labelled with biotin.

(iii) The protein is fragmented with a protease to leave peptide fragments, only the carboxyl-terminal one being biotinylated.

The biotin is used to attach the C terminal fragment to immobilised streptavidin, or preferably monomeric avidin, from which it can be released with mild acid and made available for MS--MS.
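The outcome of steps (i) to (iii) can be mimicked in silico: given a protein sequence, find the protease cleavage site nearest the C-terminus, take the fragment that would carry the biotin, and read a signature of a chosen number of residues running from the cleavage site. The Python sketch below is an illustration only. It assumes the commonly used trypsin rule (cleavage after K or R unless the next residue is P), it ignores the removal of the terminal amino acid in step (i), and the function names and example sequences are hypothetical.

```python
def last_tryptic_site(sequence):
    """Index of the residue after which the last tryptic cleavage occurs, or None.

    Assumed rule: trypsin cleaves after K or R, but not when P follows.
    """
    for i in range(len(sequence) - 2, -1, -1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            return i
    return None


def c_terminal_signature(sequence, signature_length):
    """Return (C-terminal fragment, signature, fragment length).

    The signature is read from the cleavage site towards the C-terminus, and
    the fragment length gives the position of the cleavage site relative to
    the reference terminus.
    """
    site = last_tryptic_site(sequence)
    fragment = sequence if site is None else sequence[site + 1:]
    return fragment, fragment[:signature_length], len(fragment)


if __name__ == "__main__":
    print(c_terminal_signature("MKTAYRGSPLKDEVW", 3))
```

Only the fragment returned here would be retained on the avidin support; all upstream tryptic peptides lack the biotin and would be washed away.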

All reagents are available, and the chemistry is generally well-known, as follows:

(i) The technique of carboxy-terminal sequencing of proteins is established. We note that the method of Boyd et al. (Boyd, V L, Bozzini, M, Guga, P J, DeFranco, R J, Yuan, P-M, Loudon, G M and Nguyen, D; J. Org Chem, 60, 2581, (1995)) blocks the side chain carboxyls of aspartate and glutamate residues by amidation during removal of the terminal amino acid.

(ii) Biotinylation of the free carboxyl group at the carboxy terminus may be achieved using 5-(biotinamido)pentylamine/1-ethyl-3-[3-dimethylaminopropyl]carbodiimide hydrochloride, which is marketed by Pierce & Warriner (Lee, K Y, Birckbichler, P J and Patterson, M K, Clin Chem, 34, 906 (1988)) for such a purpose.

(iii) Protease fragmentation of proteins on a membrane is an established technique (Sutton, C W, Pemberton, K S, Cottrell, J S, Corbett, J M, Wheeler, C H, Dunn, M J and Pappin, D J, Electrophoresis, 16, 308, (1995)), and Millipore Corporation produce Immobilon-CD and other PVDF membranes for that purpose. Monomeric avidin is produced by Pierce and Warriner, and allows release of biotinylated molecules using 2 mM biotin in phosphate buffered saline.

The remaining step in the method is the use of PVDF membranes (as used for trypsinization) in lieu of Zitex membranes for the sequencing reaction (i).

METHODOLOGY Binding of Lysozyme to PVDF Membrane

0.5 mm squared pieces of PVDF (Millipore) were wetted with isopropanol and incubated in 20 mg/ml lysozyme (Pharmacia) in PBS at room temperature for 30 minutes. The membranes were then air dried and stored at 4° C. until used.

Modification (Carboxyl Group Protection) of Lysozyme Bound to PVDF

Modification solution was prepared by mixing 62 mg of 2-ethyl-5-phenylisoxazolium-3'-sulfonate (Aldrich) with 50 ul of diisopropylethylamine (Aldrich) in 2 mls of CH₃CN. 100 ul of modification solution was added to each membrane and incubated at room temperature for 4 hours.

Following incubation 900 ul of water was added and each membrane was gently shaken at room temperature for 30 minutes. Each membrane was then transferred to 50 ul of CH₃CN, 450 ul of water was added and the membranes were gently shaken at room temperature for 30 minutes.

Each membrane was then transferred to 500 ul of 2% trifluoroacetic acid and incubated at room temperature overnight.

Trypsin Digest

Each membrane was transferred to 250 ul of 25 mM ammonium bicarbonate pH 7.6 solution and gently shaken at room temperature for 15 minutes.

Each protein, or protein-containing membrane, was added or transferred to 200 ul of ammonium bicarbonate solution pH 7.6 containing 5 ug of trypsin and incubated at 37° C. overnight.

Elution of Protein/Peptide Fragments From Membrane

Each membrane was transferred to 100 ul of 50% formic acid/50% ethanol solution and incubated at room temperature for 30 minutes to remove the protein/peptides. The membranes were then removed and 300 ul of water added to the 50% formic acid/50% ethanol solution containing the protein/peptides.

Analysis

The following were analyzed by reversed phase HPLC.

40 ug of trypsin in PBS; 40 ug of lysozyme in PBS; 40 ug of lysozyme digested with trypsin; 40 ug of trypsin digested with trypsin; membrane bound modified lysozyme digested with trypsin; membrane put through the modification protocol without lysozyme and digested with trypsin; membrane bound lysozyme unmodified and digested with trypsin; membrane bound lysozyme modified without trypsin digestion.

Results

We have now performed the operation using PVDF membranes in lieu of Zitex membranes for the sequencing reaction (i). We have found that the reversed phase HPLC chromatogram for lysozyme (used as a typical protein) obtained after treatment with the sequencing reactions on a PVDF membrane and trypsinization, from which the chromatogram for the same process in the absence of lysozyme has been subtracted, is similar to that obtained for lysozyme trypsinized directly. Hence the technologies are compatible and can be used to generate `signature` peptides for MS--MS identification (data not shown).

KEY TO THE DRAWINGS

FIGS. 1A-1C

Step 1: Extract proteins with harsh solvents and capture extracted proteins onto a PVDF membrane

Step 2: Loosely immobilized proteins can be washed to dispose of harsh solvents

Step 3: Treat proteins with C-terminal coupling agent

Step 4: Treat derivatized proteins with cyclisation reagent and then cleave terminal amino acid from derivatized protein

Step 5: Biotinylate newly exposed penultimate amino acid carboxyl group

Step 6: Wash membrane bound proteins to remove chemical agents and cleave proteins with trypsin in physiological buffer

Step 7: Capture terminal fragments onto avidinated beads

Step 8: Wash away free peptides then release captured peptide `tags` for analysis

Step 9: Analyse by MS or LC/MS/MS or MS/MS

FIGS. 2A-2C

Step 1: Extract proteins with harsh solvents and capture extracted proteins onto a PVDF membrane

Step 2: Loosely immobilized proteins can be washed to dispose of harsh solvents

Step 3: Treat proteins with C-terminal coupling agent

Step 4: Wash membrane bound proteins to remove chemical agents and cleave proteins with trypsin or other sequence specific cleavage agent in physiological buffer

Step 5: Biotinylate newly exposed carboxyl termini

Step 6: Capture terminal fragments onto avidinated beads, in an affinity column for example

Step 7: Analyse eluted C-terminal fragments by MS or LC/MS/MS or MS/MS

FIGS. 3A-3B

Step 1: Extract proteins with harsh solvents and capture extracted proteins onto a PVDF membrane

Step 2: Loosely immobilized proteins can be washed to dispose of harsh solvents

Step 3: Treat proteins with C-terminal coupling agent carrying immobilization effector

Step 4: Wash membrane bound proteins to remove chemical agents and cleave proteins with trypsin in physiological buffer

Step 5: Capture terminal fragments onto avidinated beads

Step 6: Wash away free peptides then release captured peptide `tags` for analysis

Step 7: Analyse by MS or LC/MS/MS or MS/MS

FIGS. 4A-4C

Step 1: Extract proteins with harsh solvents and capture extracted proteins onto a PVDF membrane

Step 2: Loosely immobilized proteins can be washed to dispose of harsh solvents

Step 3: Treat proteins with C-terminal coupling agent

Step 4: Treat coupled proteins with derivatisation reagent to ensure all exposed carboxyls are capped

Step 5: Treat derivatized proteins with cyclisation reagent and then cleave terminal amino acid from derivatized protein

Step 6: Biotinylate newly exposed penultimate amino acid carboxyl group

Step 7: Wash membrane bound proteins to remove chemical agents and cleave proteins with trypsin in physiological buffer

Step 8: Capture terminal fragments onto avidinated beads

Step 9: Wash away free peptides then release captured peptide `tags` for analysis

Step 10: Analyse by MS or LC/MS/MS or MS/MS

What is claimed is:
 1. A method for characterizing polypeptides, which comprises: (a) treating a sample comprising a population of a plurality of polypeptides with a cleavage agent which is known to recognize in polypeptide chains a specific amino acid residue or sequence and to cleave at a cleavage site, whereby the population is cleaved to generate peptide fragments; (b) isolating a population of peptide fragments which comprises only terminal peptide fragments bearing as a reference terminus the N-terminus or the C-terminus of the polypeptide from which fragments were derived, each peptide fragment bearing at the other end the cleavage site proximal to the reference terminus; and (c) determining by mass spectrometry a signature sequence of at least some of the isolated fragments, which signature sequence is the sequence of a predetermined number of amino acid residues running from the cleavage site; wherein a signature sequence characterizes each polypeptide.
 2. The method according to claim 1, wherein the reference terminus is attached to a solid phase support to immobilize the population of polypeptides or peptide fragments thereof.
 3. The method according to claim 2, wherein the population of polypeptides is immobilized before treatment with the cleavage agent.
 4. The method according to claim 2 or claim 3, wherein the reference terminus is attached to the solid phase support by: (i) treating the polypeptides with a blocking agent to block all exposed reference groups, which comprise either carboxyl groups or primary amine groups; (ii) cleaving the reference terminal amino acids to expose unblocked reference termini; and (iii) treating the unblocked reference termini with an immobilisation agent capable of coupling to the solid phase support; wherein step (b) comprises binding the treated reference termini to the solid phase support and removing unbound peptide fragments.
 5. The method according to claim 1, which further comprises (i) preparing the sample of step (a) by pre-treating the polypeptides with a blocking agent to block all exposed reference groups, which comprise either carboxyl groups or primary amine groups, so that subsequent treatment of the sample with the cleavage agent generates peptide fragments bearing unblocked reference termini; (ii) treating the unblocked reference termini with an immobilization agent capable of coupling to a solid phase support; and (iii) binding the peptide fragments containing the unblocked reference termini to the solid phase support; wherein step (b) comprises eluting unbound peptide fragments therefrom.
 6. The method according to claim 4, wherein the immobilization agent comprises a biotinylation agent.
 7. The method according to claim 4, wherein the reference group is carboxyl.
 8. The method according to claim 1, wherein the cleavage agent comprises a peptidase.
 9. The method according to claim 1, wherein the sample of step (a) comprises a sub-cellular fraction.
 10. The method according to claim 1, which further comprises preparing the sample of step (a) by liquid chromatography.
 11. The method according to claim 1, wherein the mass spectrometry is preceded by a high pressure liquid chromatography step to resolve the peptide fragments.
 12. The method according to claim 1, wherein the peptide fragments are subjected to ion exchange chromatography before step (c).
 13. The method according to claim 1, wherein the predetermined number of amino acid residues is from 3 to 30.
 14. A method for identifying polypeptides in a test sample, which comprises characterizing the polypeptides in accordance with a method according to claim 1, comparing the signature sequences and relative positions of the cleavage site obtained thereby with the signature sequences and relative positions of the cleavage site of further polypeptides in order to identify the or each polypeptide in the test sample.
 15. A method for assaying for one or more specific polypeptides in a test sample, which comprises performing a method according to claim 1, wherein the cleavage agent and relative position of the cleavage site is predetermined and the signature sequence is determined in step (c) by assaying for a predetermined sequence of amino acid residues running from the cleavage site.
 16. A method according to claim 15, wherein the cleavage site and signature sequence are predetermined by selecting corresponding sequences from one or more known target polypeptides.